NLS: A Non-Latent Similarity Algorithm
Authors
Abstract
This paper introduces a new algorithm for calculating semantic similarity within and between texts. We refer to this algorithm as NLS, for Non-Latent Similarity. This algorithm makes use of a second-order similarity matrix (SOM) based on the cosine of the vectors from a first-order (non-latent) matrix. This first-order matrix (FOM) could be generated in any number of ways; here we used a method modified from Lin (1998). We evaluated the ability of NLS to predict word associations, comparing NLS to both Latent Semantic Analysis (LSA) and the FOM. Across two sets of norms, we found that LSA, NLS, and FOM were equally predictive of associates to modifiers and verbs. However, the NLS and FOM algorithms better predicted associates to nouns than did LSA.
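The core construction described above can be sketched briefly: given any first-order word-by-feature matrix, the second-order similarity matrix is simply the cosine between every pair of row vectors. The toy matrix below is illustrative only, not the FOM derived from Lin (1998):

```python
import numpy as np

def second_order_matrix(fom):
    """Compute the second-order similarity matrix (SOM): the cosine
    between every pair of row vectors of a first-order matrix (FOM)."""
    norms = np.linalg.norm(fom, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # guard empty rows against division by zero
    unit = fom / norms               # normalize each row to unit length
    return unit @ unit.T             # dot products of unit vectors = cosines

# toy first-order matrix: 3 words x 4 co-occurrence features (hypothetical data)
fom = np.array([[1.0, 0.0, 2.0, 0.0],
                [1.0, 0.0, 2.0, 1.0],
                [0.0, 3.0, 0.0, 1.0]])
som = second_order_matrix(fom)
print(np.round(som, 3))
```

Words sharing first-order features (rows 0 and 1) receive a high second-order similarity even where their raw feature overlap is partial; words with disjoint features (rows 0 and 2) score near zero.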
Similar Papers
Group Sparsity Residual with Non-Local Samples for Image Denoising
Inspired by group-based sparse coding, the recently proposed group sparsity residual (GSR) scheme has demonstrated superior performance in image processing. However, one challenge in GSR is to estimate the residual by using a proper reference for the group-based sparse coding (GSC), which is desired to be as close to the truth as possible. Previous research utilized the estimations from other algo...
A Comparative Study of Machine Learning Approaches- SVM and LS-SVM using a Web Search Engine Based Application
Semantic similarity refers to the concept by which a set of documents, or words within the documents, are assigned a weight based on their meaning. The accurate measurement of such similarity plays an important role in Natural Language Processing and Information Retrieval tasks such as Query Expansion and Word Sense Disambiguation. Page counts and snippets retrieved by the search engines help to me...
lsemantica: A Stata Command for Text Similarity based on Latent Semantic Analysis
The lsemantica command, presented in this paper, implements Latent Semantic Analysis in Stata. Latent Semantic Analysis is a machine learning algorithm for word and text similarity comparison. Latent Semantic Analysis uses Truncated Singular Value Decomposition to derive the hidden semantic relationships between words and texts. lsemantica provides a simple command for Latent Semantic Analysis ...
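The truncated-SVD step at the heart of LSA, as described in this abstract, can be sketched outside Stata as well. The following is a minimal NumPy illustration (the matrix and the choice of k are hypothetical), not the lsemantica implementation:

```python
import numpy as np

def lsa_term_vectors(term_doc, k):
    """Truncated SVD of a term-by-document matrix: keep the top-k
    singular dimensions and return reduced term vectors (U_k * S_k)."""
    U, S, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k] * S[:k]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# toy term-by-document counts: 4 terms x 3 documents (hypothetical data)
X = np.array([[2.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [0.0, 2.0, 1.0]])
vecs = lsa_term_vectors(X, k=2)
```

Cosines between the reduced term vectors then serve as the LSA similarity estimates: terms with similar document distributions (rows 0 and 1) end up closer than terms with disjoint ones (rows 0 and 2).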
Relationship Matrix Nonnegative Decomposition for Clustering
Nonnegative matrix factorization (NMF) is a popular tool for analyzing the latent structure of nonnegative data. For a positive pairwise similarity matrix, symmetric NMF (SNMF) and weighted NMF (WNMF) can be used to cluster the data. However, both of them are not very efficient for the ill-structured pairwise similarity matrix. In this paper, a novel model, called relationship matrix nonnegative deco...
Comparison of cosine similarity and k-NN for automated essays scoring
In this paper, a comparison between cosine similarity and the k-Nearest Neighbors algorithm within the Latent Semantic Analysis method for scoring Arabic essays automatically is presented. It also improves Latent Semantic Analysis by preprocessing the entered text: unifying the form of letters, deleting the formatting, replacing synonyms, stemming, and deleting "Stop Words". The results showed that the use of Cos...